6.1.5 Random Vectors

When dealing with multiple random variables, it is sometimes useful to use vector and matrix 
notations. This makes the formulas more compact and lets us use facts from linear algebra. In this 
section, we briefly explore this avenue. The reader should be familiar with matrix algebra before 
reading this section. When we have $n$ random variables $X_1, X_2, \dots, X_n$, we can put them in a
(column) vector $\mathbf{X}$:

$$\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}.$$

We call $\mathbf{X}$ a random vector. Here $\mathbf{X}$ is an $n$-dimensional vector because it consists of $n$ random
variables. In this book, we usually use bold capital letters such as $\mathbf{X}$, $\mathbf{Y}$, and $\mathbf{Z}$ to represent a random
vector. To show a possible value of a random vector we usually use bold lowercase letters such as
$\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$. Thus, we can write the CDF of the random vector $\mathbf{X}$ as

$$F_{\mathbf{X}}(\mathbf{x}) = F_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \dots, X_n \le x_n).$$

If the $X_i$'s are jointly continuous, the PDF of $\mathbf{X}$ can be written as

$$f_{\mathbf{X}}(\mathbf{x}) = f_{X_1, X_2, \dots, X_n}(x_1, x_2, \dots, x_n).$$


Expectation: 

The expected value vector or the mean vector of the random vector $\mathbf{X}$ is defined as

$$E\mathbf{X} = \begin{bmatrix} EX_1 \\ EX_2 \\ \vdots \\ EX_n \end{bmatrix}.$$


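Similarly, a random matrix is a matrix whose elements are random variables. In particular, we can
have an $m$ by $n$ random matrix $\mathbf{M}$ as

$$\mathbf{M} = \begin{bmatrix} X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ X_{m1} & X_{m2} & \cdots & X_{mn} \end{bmatrix}.$$

We sometimes write this as $\mathbf{M} = [X_{ij}]$, which means that $X_{ij}$ is the element in the $i$th row and $j$th
column of $\mathbf{M}$. The mean matrix of $\mathbf{M}$ is given by

$$E\mathbf{M} = \begin{bmatrix} EX_{11} & EX_{12} & \cdots & EX_{1n} \\ EX_{21} & EX_{22} & \cdots & EX_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ EX_{m1} & EX_{m2} & \cdots & EX_{mn} \end{bmatrix}.$$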


Linearity of expectation is also valid for random vectors and matrices. In particular, let $\mathbf{X}$ be an
$n$-dimensional random vector and let the random vector $\mathbf{Y}$ be defined as

$$\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b},$$

where $\mathbf{A}$ is a fixed (non-random) $m$ by $n$ matrix and $\mathbf{b}$ is a fixed $m$-dimensional vector. Then we
have

$$E\mathbf{Y} = \mathbf{A}E\mathbf{X} + \mathbf{b}.$$

Also, if $\mathbf{X}_1, \mathbf{X}_2, \dots, \mathbf{X}_k$ are $n$-dimensional random vectors, then we have

$$E[\mathbf{X}_1 + \mathbf{X}_2 + \cdots + \mathbf{X}_k] = E\mathbf{X}_1 + E\mathbf{X}_2 + \cdots + E\mathbf{X}_k.$$
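As a rough numerical illustration of linearity, the following Python sketch (standard library only) compares the sample mean of $\mathbf{A}\mathbf{X} + \mathbf{b}$ with $\mathbf{A}$ applied to the sample mean of $\mathbf{X}$, plus $\mathbf{b}$. The matrix `A`, the vector `b`, the sampling distribution, and the helper names `mat_vec` and `mean_vector` are assumptions chosen only for this example.

```python
import random

def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mean_vector(samples):
    """Component-wise sample mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

random.seed(0)
A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]   # fixed 2-by-3 matrix (illustrative)
b = [0.5, -2.0]          # fixed 2-dimensional vector (illustrative)

# Samples of a 3-dimensional random vector X whose components are dependent.
xs = []
for _ in range(10_000):
    u = random.random()
    xs.append([u, u * u, random.gauss(1.0, 2.0)])

# Y = A X + b for every sample.
ys = [[yi + bi for yi, bi in zip(mat_vec(A, x), b)] for x in xs]

lhs = mean_vector(ys)                                              # estimate of EY
rhs = [v + bi for v, bi in zip(mat_vec(A, mean_vector(xs)), b)]    # A EX + b
print(lhs)
print(rhs)
```

The two printed vectors agree up to floating-point rounding, because the sample mean is itself a linear operation. No independence between the components of $\mathbf{X}$ is needed.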


Correlation and Covariance Matrix 

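For a random vector $\mathbf{X}$, we define the correlation matrix, $\mathbf{R_X}$, as

$$\mathbf{R_X} = E[\mathbf{X}\mathbf{X}^T] = E\begin{bmatrix} X_1^2 & X_1X_2 & \cdots & X_1X_n \\ X_2X_1 & X_2^2 & \cdots & X_2X_n \\ \vdots & \vdots & \ddots & \vdots \\ X_nX_1 & X_nX_2 & \cdots & X_n^2 \end{bmatrix} = \begin{bmatrix} E[X_1^2] & E[X_1X_2] & \cdots & E[X_1X_n] \\ E[X_2X_1] & E[X_2^2] & \cdots & E[X_2X_n] \\ \vdots & \vdots & \ddots & \vdots \\ E[X_nX_1] & E[X_nX_2] & \cdots & E[X_n^2] \end{bmatrix},$$

where $T$ shows matrix transposition.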

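The covariance matrix, $\mathbf{C_X}$, is defined as

$$\begin{aligned}
\mathbf{C_X} &= E[(\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^T] \\
&= E\begin{bmatrix} (X_1 - EX_1)^2 & (X_1 - EX_1)(X_2 - EX_2) & \cdots & (X_1 - EX_1)(X_n - EX_n) \\ (X_2 - EX_2)(X_1 - EX_1) & (X_2 - EX_2)^2 & \cdots & (X_2 - EX_2)(X_n - EX_n) \\ \vdots & \vdots & \ddots & \vdots \\ (X_n - EX_n)(X_1 - EX_1) & (X_n - EX_n)(X_2 - EX_2) & \cdots & (X_n - EX_n)^2 \end{bmatrix} \\
&= \begin{bmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) & \cdots & \mathrm{Cov}(X_1, X_n) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) & \cdots & \mathrm{Cov}(X_2, X_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(X_n, X_1) & \mathrm{Cov}(X_n, X_2) & \cdots & \mathrm{Var}(X_n) \end{bmatrix}.
\end{aligned}$$

The covariance matrix is a generalization of the variance of a random variable. Remember that for
a random variable, we have $\mathrm{Var}(X) = EX^2 - (EX)^2$. The following example extends this formula to
random vectors.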


Example 6.11 

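For a random vector $\mathbf{X}$, show

$$\mathbf{C_X} = \mathbf{R_X} - E\mathbf{X}\,E\mathbf{X}^T.$$

• Solution

We have

$$\begin{aligned}
\mathbf{C_X} &= E[(\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^T] \\
&= E[(\mathbf{X} - E\mathbf{X})(\mathbf{X}^T - E\mathbf{X}^T)] \\
&= E[\mathbf{X}\mathbf{X}^T] - E\mathbf{X}\,E\mathbf{X}^T - E\mathbf{X}\,E\mathbf{X}^T + E\mathbf{X}\,E\mathbf{X}^T \quad \text{(by linearity of expectation)} \\
&= \mathbf{R_X} - E\mathbf{X}\,E\mathbf{X}^T.
\end{aligned}$$

Correlation matrix of $\mathbf{X}$:

$$\mathbf{R_X} = E[\mathbf{X}\mathbf{X}^T]$$

Covariance matrix of $\mathbf{X}$:

$$\mathbf{C_X} = E[(\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^T] = \mathbf{R_X} - E\mathbf{X}\,E\mathbf{X}^T$$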
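The identity $\mathbf{C_X} = \mathbf{R_X} - E\mathbf{X}\,E\mathbf{X}^T$ also holds when expectations are replaced by sample averages, which makes it easy to check numerically. The sketch below is only an illustration under assumptions: the two-component random vector, the sample size, and the helper names `outer` and `mean_matrix` are invented for this example.

```python
import random

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def mean_matrix(mats):
    """Entry-wise average of a list of equal-size matrices."""
    n = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)]
            for i in range(rows)]

random.seed(1)
xs = []
for _ in range(5_000):
    u, v = random.random(), random.random()
    xs.append([u, u + v])            # two correlated components

m = [sum(col) / len(xs) for col in zip(*xs)]          # sample version of EX
R = mean_matrix([outer(x, x) for x in xs])            # sample version of E[X X^T]

# Covariance from the definition E[(X - EX)(X - EX)^T].
centered = [[xi - mi for xi, mi in zip(x, m)] for x in xs]
C_def = mean_matrix([outer(c, c) for c in centered])

# Covariance from the identity R_X - EX EX^T.
C_alt = [[R[i][j] - m[i] * m[j] for j in range(2)] for i in range(2)]
print(C_def)
print(C_alt)
```

Both matrices should agree up to rounding. The off-diagonal entry should be close to $1/12$, since $\mathrm{Cov}(U, U + V) = \mathrm{Var}(U)$ for independent $U$ and $V$.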


Example 6.12 

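Let $\mathbf{X}$ be an $n$-dimensional random vector and the random vector $\mathbf{Y}$ be defined as

$$\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b},$$

where $\mathbf{A}$ is a fixed $m$ by $n$ matrix and $\mathbf{b}$ is a fixed $m$-dimensional vector. Show that

$$\mathbf{C_Y} = \mathbf{A}\mathbf{C_X}\mathbf{A}^T.$$

• Solution

Note that by linearity of expectation, we have

$$E\mathbf{Y} = \mathbf{A}E\mathbf{X} + \mathbf{b}.$$

By definition, we have

$$\begin{aligned}
\mathbf{C_Y} &= E[(\mathbf{Y} - E\mathbf{Y})(\mathbf{Y} - E\mathbf{Y})^T] \\
&= E[(\mathbf{A}\mathbf{X} + \mathbf{b} - \mathbf{A}E\mathbf{X} - \mathbf{b})(\mathbf{A}\mathbf{X} + \mathbf{b} - \mathbf{A}E\mathbf{X} - \mathbf{b})^T] \\
&= E[\mathbf{A}(\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^T\mathbf{A}^T] \\
&= \mathbf{A}E[(\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^T]\mathbf{A}^T \quad \text{(by linearity of expectation)} \\
&= \mathbf{A}\mathbf{C_X}\mathbf{A}^T.
\end{aligned}$$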
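The result $\mathbf{C_Y} = \mathbf{A}\mathbf{C_X}\mathbf{A}^T$ can be sanity-checked on simulated data, since the same algebra applies to sample covariance matrices. In the sketch below, the matrix `A`, the vector `b`, the distributions used for $\mathbf{X}$, and the helpers `cov_matrix`, `matmul`, and `transpose` are all assumptions made for illustration.

```python
import random

def cov_matrix(samples):
    """Sample covariance matrix (divides by the number of samples)."""
    n, d = len(samples), len(samples[0])
    m = [sum(col) / n for col in zip(*samples)]
    return [[sum((s[i] - m[i]) * (s[j] - m[j]) for s in samples) / n
             for j in range(d)] for i in range(d)]

def matmul(P, Q):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

random.seed(2)
A = [[2.0, -1.0],
     [0.5, 3.0],
     [1.0, 1.0]]            # fixed 3-by-2 matrix (illustrative)
b = [1.0, 0.0, -4.0]        # fixed 3-dimensional vector (illustrative)

xs = [[random.gauss(0, 1), random.expovariate(1.0)] for _ in range(5_000)]
ys = [[sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]
      for x in xs]

C_Y = cov_matrix(ys)                                           # covariance of Y = A X + b
A_CX_AT = matmul(matmul(A, cov_matrix(xs)), transpose(A))      # A C_X A^T
print(C_Y)
print(A_CX_AT)
```

Note that `b` has no effect on the covariance: shifting a random vector by a constant changes its mean but not its spread.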


Example 6.13 

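Let $X$ and $Y$ be two jointly continuous random variables with joint PDF

$$f_{XY}(x, y) = \begin{cases} \frac{3}{2}x^2 + y & 0 < x, y < 1 \\ 0 & \text{otherwise} \end{cases}$$

and let the random vector $\mathbf{U}$ be defined as

$$\mathbf{U} = \begin{bmatrix} X \\ Y \end{bmatrix}.$$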


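Find the correlation and covariance matrices of $\mathbf{U}$.

• Solution

We first obtain the marginal PDFs of $X$ and $Y$. Note that $R_X = R_Y = (0, 1)$. For $x \in R_X$, we have

$$f_X(x) = \int_0^1 \left(\frac{3}{2}x^2 + y\right) dy = \frac{3}{2}x^2 + \frac{1}{2}, \quad \text{for } 0 < x < 1.$$

Similarly, for $y \in R_Y$, we have

$$f_Y(y) = \int_0^1 \left(\frac{3}{2}x^2 + y\right) dx = y + \frac{1}{2}, \quad \text{for } 0 < y < 1.$$

From these, we obtain $EX = \frac{5}{8}$, $EX^2 = \frac{7}{15}$, $EY = \frac{7}{12}$, and $EY^2 = \frac{5}{12}$. We also need $EXY$. By LOTUS, we can write

$$EXY = \int_0^1 \int_0^1 xy\left(\frac{3}{2}x^2 + y\right) dx\,dy = \int_0^1 \left(\frac{3}{8}y + \frac{1}{2}y^2\right) dy = \frac{17}{48}.$$

From this, we also obtain

$$\mathrm{Cov}(X, Y) = EXY - EX\,EY = \frac{17}{48} - \frac{5}{8} \cdot \frac{7}{12} = -\frac{1}{96}.$$

The correlation matrix $\mathbf{R_U}$ is given by

$$\mathbf{R_U} = E[\mathbf{U}\mathbf{U}^T] = \begin{bmatrix} EX^2 & EXY \\ EYX & EY^2 \end{bmatrix} = \begin{bmatrix} \frac{7}{15} & \frac{17}{48} \\ \frac{17}{48} & \frac{5}{12} \end{bmatrix}.$$

The covariance matrix $\mathbf{C_U}$ is given by

$$\mathbf{C_U} = \begin{bmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X, Y) \\ \mathrm{Cov}(Y, X) & \mathrm{Var}(Y) \end{bmatrix} = \begin{bmatrix} \frac{73}{960} & -\frac{1}{96} \\ -\frac{1}{96} & \frac{11}{144} \end{bmatrix}.$$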

Properties of the Covariance Matrix: 

The covariance matrix is the generalization of the variance to random vectors. It is an important
matrix and is used extensively. Let's take a moment and discuss its properties. Here, we use
concepts from linear algebra such as eigenvalues and positive definiteness. First note that, for any
random vector $\mathbf{X}$, the covariance matrix $\mathbf{C_X}$ is a symmetric matrix. This is because if $\mathbf{C_X} = [c_{ij}]$,
then

$$c_{ij} = \mathrm{Cov}(X_i, X_j) = \mathrm{Cov}(X_j, X_i) = c_{ji}.$$

Thus, the covariance matrix has all the nice properties of symmetric matrices. In particular, $\mathbf{C_X}$ can
be diagonalized and all the eigenvalues of $\mathbf{C_X}$ are real. Here, we assume $\mathbf{X}$ is a real random vector,
i.e., the $X_i$'s can only take real values. A special important property of the covariance matrix is that
it is positive semi-definite (PSD). Remember from linear algebra that a symmetric matrix $\mathbf{M}$ is
positive semi-definite (PSD) if, for all vectors $\mathbf{b}$, we have

$$\mathbf{b}^T\mathbf{M}\mathbf{b} \ge 0.$$

Also, $\mathbf{M}$ is said to be positive definite (PD) if, for all vectors $\mathbf{b} \neq \mathbf{0}$, we have

$$\mathbf{b}^T\mathbf{M}\mathbf{b} > 0.$$

By the above definitions, we note that every PD matrix is also PSD, but the converse is not
generally true. Here, we show that covariance matrices are always PSD.

Theorem 6.2. Let $\mathbf{X}$ be a random vector with $n$ elements. Then, its covariance matrix $\mathbf{C_X}$ is
positive semi-definite (PSD).

Proof. Let $\mathbf{b}$ be any fixed vector with $n$ elements. Define the random variable $Y$ as

$$Y = \mathbf{b}^T(\mathbf{X} - E\mathbf{X}).$$

We have

$$\begin{aligned}
0 \le EY^2 &= E(YY^T) \\
&= \mathbf{b}^T E[(\mathbf{X} - E\mathbf{X})(\mathbf{X} - E\mathbf{X})^T]\mathbf{b} \\
&= \mathbf{b}^T\mathbf{C_X}\mathbf{b}.
\end{aligned}$$

Note that the eigenvalues of a PSD matrix are always larger than or equal to zero. If all the
eigenvalues are strictly larger than zero, then the matrix is positive definite. From linear algebra,
we know that a real symmetric matrix is positive definite if and only if all its eigenvalues are
positive. Since $\mathbf{C_X}$ is a real symmetric matrix, we can state the following theorem.

Theorem 6.3. Let $\mathbf{X}$ be a random vector with $n$ elements. Then its covariance matrix $\mathbf{C_X}$ is positive
definite (PD) if and only if all its eigenvalues are larger than zero. Equivalently, $\mathbf{C_X}$ is positive
definite (PD) if and only if $\det(\mathbf{C_X}) > 0$.

Note that the second part of the theorem is implied by the first part. This is because the determinant
of a matrix is the product of its eigenvalues, and we already know that all eigenvalues of $\mathbf{C_X}$ are
larger than or equal to zero.
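The proof of Theorem 6.2 says that $\mathbf{b}^T\mathbf{C_X}\mathbf{b}$ is the variance of the scalar $\mathbf{b}^T\mathbf{X}$, which is why it cannot be negative. The following sketch checks this on simulated data; the three-component random vector, the range used for the random vectors `b`, and the helper name `cov_matrix` are assumptions made for illustration.

```python
import random

def cov_matrix(samples):
    """Sample covariance matrix (divides by the number of samples)."""
    n, d = len(samples), len(samples[0])
    m = [sum(col) / n for col in zip(*samples)]
    return [[sum((s[i] - m[i]) * (s[j] - m[j]) for s in samples) / n
             for j in range(d)] for i in range(d)]

random.seed(3)
xs = [[random.random(), random.gauss(0, 2), random.random() ** 2]
      for _ in range(2_000)]
C = cov_matrix(xs)

checks = []
for _ in range(50):
    b = [random.uniform(-5, 5) for _ in range(3)]
    # Quadratic form b^T C b.
    quad = sum(b[i] * C[i][j] * b[j] for i in range(3) for j in range(3))
    # Sample variance of the scalar b^T X.
    proj = [sum(bi * xi for bi, xi in zip(b, x)) for x in xs]
    mu = sum(proj) / len(proj)
    var_proj = sum((p - mu) ** 2 for p in proj) / len(proj)
    checks.append((quad, var_proj))

print(min(q for q, _ in checks))   # smallest quadratic form encountered
```

Each pair in `checks` holds two numbers that agree up to rounding, and none of the quadratic forms is negative.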

Example 6.14 

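Let $X$ and $Y$ be two independent $Uniform(0, 1)$ random variables. Let the random vectors $\mathbf{U}$ and $\mathbf{V}$
be defined as

$$\mathbf{U} = \begin{bmatrix} X \\ X + Y \end{bmatrix}, \qquad \mathbf{V} = \begin{bmatrix} X \\ Y \\ X + Y \end{bmatrix}.$$

Determine whether $\mathbf{C_U}$ and $\mathbf{C_V}$ are positive definite.

• Solution

Let us first find $\mathbf{C_U}$. We have

$$\mathbf{C_U} = \begin{bmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X, X + Y) \\ \mathrm{Cov}(X + Y, X) & \mathrm{Var}(X + Y) \end{bmatrix}.$$

Since $X$ and $Y$ are independent $Uniform(0, 1)$ random variables, we have

$$\mathrm{Var}(X) = \mathrm{Var}(Y) = \frac{1}{12},$$

$$\mathrm{Cov}(X, X + Y) = \mathrm{Cov}(X, X) + \mathrm{Cov}(X, Y) = \frac{1}{12} + 0 = \frac{1}{12},$$

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \frac{1}{6}.$$

Thus,

$$\mathbf{C_U} = \begin{bmatrix} \frac{1}{12} & \frac{1}{12} \\ \frac{1}{12} & \frac{1}{6} \end{bmatrix}.$$

So we conclude

$$\det(\mathbf{C_U}) = \frac{1}{12} \cdot \frac{1}{6} - \frac{1}{12} \cdot \frac{1}{12} = \frac{1}{144} > 0.$$

Therefore, $\mathbf{C_U}$ is positive definite. For $\mathbf{C_V}$, we have

$$\mathbf{C_V} = \begin{bmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X, Y) & \mathrm{Cov}(X, X + Y) \\ \mathrm{Cov}(Y, X) & \mathrm{Var}(Y) & \mathrm{Cov}(Y, X + Y) \\ \mathrm{Cov}(X + Y, X) & \mathrm{Cov}(X + Y, Y) & \mathrm{Var}(X + Y) \end{bmatrix} = \begin{bmatrix} \frac{1}{12} & 0 & \frac{1}{12} \\ 0 & \frac{1}{12} & \frac{1}{12} \\ \frac{1}{12} & \frac{1}{12} & \frac{1}{6} \end{bmatrix}.$$

So we conclude

$$\det(\mathbf{C_V}) = \frac{1}{12}\left(\frac{1}{12} \cdot \frac{1}{6} - \frac{1}{12} \cdot \frac{1}{12}\right) - 0 + \frac{1}{12}\left(0 - \frac{1}{12} \cdot \frac{1}{12}\right) = 0.$$

Thus, $\mathbf{C_V}$ is not positive definite (we already know that it is positive semi-definite).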
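The two determinants in Example 6.14 can be verified with exact fractions. The sketch below is a minimal illustration: the cofactor-expansion helper `det` and the test vector `b` are assumptions added for this example, and cofactor expansion is only sensible for small matrices like these.

```python
from fractions import Fraction as F

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = F(0)
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

t = F(1, 12)                 # Var(X) = Var(Y) for Uniform(0, 1)
C_U = [[t, t],
       [t, 2 * t]]
C_V = [[t, F(0), t],
       [F(0), t, t],
       [t, t, 2 * t]]

det_U, det_V = det(C_U), det(C_V)
print(det_U)                 # 1/144
print(det_V)                 # 0

# A direction along which V has no variance: b^T V = X + Y - (X + Y) = 0.
b = [1, 1, -1]
quad = sum(b[i] * C_V[i][j] * b[j] for i in range(3) for j in range(3))
print(quad)                  # 0
```

The vector `b = [1, 1, -1]` shows why $\mathbf{C_V}$ is singular: the third component of $\mathbf{V}$ is the sum of the first two, so one linear combination of the components is constant.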


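Finally, if we have two random vectors, $\mathbf{X}$ and $\mathbf{Y}$, we can define the cross correlation matrix of $\mathbf{X}$
and $\mathbf{Y}$ as

$$\mathbf{R_{XY}} = E[\mathbf{X}\mathbf{Y}^T].$$

Also, the cross covariance matrix of $\mathbf{X}$ and $\mathbf{Y}$ is

$$\mathbf{C_{XY}} = E[(\mathbf{X} - E\mathbf{X})(\mathbf{Y} - E\mathbf{Y})^T].$$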


Functions of Random Vectors: The Method of Transformations

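A function of a random vector is a random vector. Thus, the methods that we discussed regarding
functions of two random variables can be used to find distributions of functions of random vectors.
For example, we can state a more general form of Theorem 5.1 (method of transformations). Let us
first explain the method and then see some examples on how to use it. Let $\mathbf{X}$ be an $n$-dimensional
random vector with joint PDF $f_{\mathbf{X}}(\mathbf{x})$. Let $G: \mathbb{R}^n \mapsto \mathbb{R}^n$ be a continuous and invertible function with
continuous partial derivatives and let $H = G^{-1}$. Suppose that the random vector $\mathbf{Y}$ is given by
$\mathbf{Y} = G(\mathbf{X})$ and thus $\mathbf{X} = G^{-1}(\mathbf{Y}) = H(\mathbf{Y})$. That is,

$$\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} = \begin{bmatrix} H_1(Y_1, Y_2, \dots, Y_n) \\ H_2(Y_1, Y_2, \dots, Y_n) \\ \vdots \\ H_n(Y_1, Y_2, \dots, Y_n) \end{bmatrix}.$$

Then, the PDF of $\mathbf{Y}$, $f_{Y_1, Y_2, \dots, Y_n}(y_1, y_2, \dots, y_n)$, is given by

$$f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}(H(\mathbf{y}))\,|J|,$$

where $J$ is the Jacobian of $H$ defined by

$$J = \det \begin{bmatrix} \frac{\partial H_1}{\partial y_1} & \frac{\partial H_1}{\partial y_2} & \cdots & \frac{\partial H_1}{\partial y_n} \\ \frac{\partial H_2}{\partial y_1} & \frac{\partial H_2}{\partial y_2} & \cdots & \frac{\partial H_2}{\partial y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial H_n}{\partial y_1} & \frac{\partial H_n}{\partial y_2} & \cdots & \frac{\partial H_n}{\partial y_n} \end{bmatrix},$$

and evaluated at $(y_1, y_2, \dots, y_n)$.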


Example 6.15 

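Let $\mathbf{X}$ be an $n$-dimensional random vector. Let $\mathbf{A}$ be a fixed (non-random) invertible $n$ by $n$ matrix,
and $\mathbf{b}$ be a fixed $n$-dimensional vector. Define the random vector $\mathbf{Y}$ as

$$\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b}.$$

Find the PDF of $\mathbf{Y}$ in terms of the PDF of $\mathbf{X}$.

• Solution

Since $\mathbf{A}$ is invertible, we can write

$$\mathbf{X} = \mathbf{A}^{-1}(\mathbf{Y} - \mathbf{b}).$$

We can also check that

$$J = \det(\mathbf{A}^{-1}) = \frac{1}{\det(\mathbf{A})}.$$

Thus, we conclude that

$$f_{\mathbf{Y}}(\mathbf{y}) = \frac{1}{|\det(\mathbf{A})|} f_{\mathbf{X}}\big(\mathbf{A}^{-1}(\mathbf{y} - \mathbf{b})\big).$$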
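For an invertible linear map $\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b}$, the method of transformations gives $f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}(\mathbf{A}^{-1}(\mathbf{y} - \mathbf{b})) / |\det(\mathbf{A})|$. The sketch below evaluates this for a two-dimensional case. The choice of $f_{\mathbf{X}}$ (two independent $Exponential(1)$ components), the values of `A` and `b`, and the helper name `solve2` are assumptions made only for this illustration.

```python
import math

def solve2(A, v):
    """Solve the 2-by-2 system A x = v by Cramer's rule; returns (x, det(A))."""
    (a, b_), (c, d) = A
    det = a * d - b_ * c
    if det == 0:
        raise ValueError("A must be invertible")
    x = [(d * v[0] - b_ * v[1]) / det, (a * v[1] - c * v[0]) / det]
    return x, det

def f_X(x):
    """Illustrative PDF: two independent Exponential(1) components."""
    return math.exp(-(x[0] + x[1])) if min(x) >= 0 else 0.0

def f_Y(y, A, b):
    """PDF of Y = A X + b, computed as f_X(A^{-1}(y - b)) / |det(A)|."""
    x, det = solve2(A, [y[0] - b[0], y[1] - b[1]])
    return f_X(x) / abs(det)

A = [[2.0, 1.0],
     [0.0, 1.0]]            # invertible, det(A) = 2
b = [1.0, -1.0]

x0 = [1.0, 2.0]
y0 = [A[0][0] * x0[0] + A[0][1] * x0[1] + b[0],
      A[1][0] * x0[0] + A[1][1] * x0[1] + b[1]]    # y0 = A x0 + b

print(f_Y(y0, A, b))        # density of Y at the image of x0
print(f_X(x0) / 2.0)        # f_X(x0) / |det(A)|, the same number
```

The factor $1/|\det(\mathbf{A})|$ accounts for how the map stretches area: here every region doubles in area, so the density is halved.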
Normal (Gaussian) Random Vectors:

We discussed two jointly normal random variables previously in Section 5.3.2. In particular, two
random variables $X$ and $Y$ are said to be bivariate normal or jointly normal if $aX + bY$ has a normal
distribution for all $a, b \in \mathbb{R}$. We can extend this definition to $n$ jointly normal random variables.
Random variables $X_1, X_2, \dots, X_n$ are said to be jointly normal if, for all $a_1, a_2, \dots, a_n \in \mathbb{R}$, the
random variable

$$a_1X_1 + a_2X_2 + \cdots + a_nX_n$$

is a normal random variable.

As before, we agree that the constant zero is a normal random variable with zero mean and
variance, i.e., $N(0, 0)$. When we have several jointly normal random variables, we often put them in
a vector. The resulting random vector is called a normal (Gaussian) random vector.

A random vector

$$\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}$$

is said to be normal or Gaussian if the random variables $X_1, X_2, \dots, X_n$ are jointly normal.

To find the general form for the PDF of a Gaussian random vector it is convenient to start from the
simplest case where the $X_i$'s are independent and identically distributed (i.i.d.), $X_i \sim N(0, 1)$. In this
case, we know how to find the joint PDF. It is simply the product of the individual (marginal)
PDFs. Let's call such a random vector the standard normal random vector. So, let

$$\mathbf{Z} = \begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_n \end{bmatrix},$$

where the $Z_i$'s are i.i.d. and $Z_i \sim N(0, 1)$. Then, we have

$$\begin{aligned}
f_{\mathbf{Z}}(\mathbf{z}) &= f_{Z_1, Z_2, \dots, Z_n}(z_1, z_2, \dots, z_n) \\
&= \prod_{i=1}^{n} f_{Z_i}(z_i) \\
&= \frac{1}{(2\pi)^{\frac{n}{2}}} \exp\left\{-\frac{1}{2}\sum_{i=1}^{n} z_i^2\right\} \\
&= \frac{1}{(2\pi)^{\frac{n}{2}}} \exp\left\{-\frac{1}{2}\mathbf{z}^T\mathbf{z}\right\}.
\end{aligned}$$

For a standard normal random vector $\mathbf{Z}$, where the $Z_i$'s are i.i.d. and $Z_i \sim N(0, 1)$, the PDF is given by

$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{1}{(2\pi)^{\frac{n}{2}}} \exp\left\{-\frac{1}{2}\mathbf{z}^T\mathbf{z}\right\}.$$

Now, we need to extend this formula to a general normal random vector $\mathbf{X}$ with mean $\mathbf{m}$ and
covariance matrix $\mathbf{C}$. This is very similar to when we defined general normal random variables
from the standard normal random variable. We remember that if $Z \sim N(0, 1)$, then the random
variable $X = \sigma Z + \mu$ has $N(\mu, \sigma^2)$ distribution. We would like to do the same thing for normal
random vectors.


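Assume that we have a normal random vector $\mathbf{X}$ with mean $\mathbf{m}$ and covariance matrix $\mathbf{C}$. We write
$\mathbf{X} \sim N(\mathbf{m}, \mathbf{C})$. Further, assume that $\mathbf{C}$ is a positive definite matrix. (The positive definiteness
assumption here does not create any limitations. We already know that $\mathbf{C}$ is positive semi-definite
(Theorem 6.2), so $\det(\mathbf{C}) \ge 0$. We also know that $\mathbf{C}$ is positive definite if and only if $\det(\mathbf{C}) > 0$
(Theorem 6.3). So here, we are only excluding the case $\det(\mathbf{C}) = 0$. If $\det(\mathbf{C}) = 0$, then you can
show that you can write some $X_i$'s as a linear combination of others, so indeed we can remove them
from the vector without losing any information.) Then from linear algebra we know that there
exists an $n$ by $n$ matrix $\mathbf{Q}$ such that

$$\mathbf{Q}\mathbf{Q}^T = \mathbf{I} \quad (\mathbf{I} \text{ is the identity matrix}),$$

$$\mathbf{C} = \mathbf{Q}\mathbf{D}\mathbf{Q}^T,$$

where $\mathbf{D}$ is a diagonal matrix

$$\mathbf{D} = \begin{bmatrix} d_{11} & 0 & \cdots & 0 \\ 0 & d_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_{nn} \end{bmatrix}.$$

The positive definiteness assumption guarantees that all $d_{ii}$'s are positive. Let's define

$$\mathbf{D}^{\frac{1}{2}} = \begin{bmatrix} \sqrt{d_{11}} & 0 & \cdots & 0 \\ 0 & \sqrt{d_{22}} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{d_{nn}} \end{bmatrix}.$$

We have $\mathbf{D}^{\frac{1}{2}}\mathbf{D}^{\frac{1}{2}} = \mathbf{D}$ and $\mathbf{D}^{\frac{1}{2}} = \left(\mathbf{D}^{\frac{1}{2}}\right)^T$. Also define

$$\mathbf{A} = \mathbf{Q}\mathbf{D}^{\frac{1}{2}}\mathbf{Q}^T.$$

Then,

$$\mathbf{A}\mathbf{A}^T = \mathbf{A}^T\mathbf{A} = \mathbf{C}.$$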
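To make the construction $\mathbf{A} = \mathbf{Q}\mathbf{D}^{\frac{1}{2}}\mathbf{Q}^T$ concrete, the sketch below carries it out for a 2-by-2 covariance matrix using only the standard library. It relies on the closed-form rotation angle that diagonalizes a symmetric 2-by-2 matrix; this shortcut, the function name `sqrt_factor_2x2`, and the example matrix `C` are assumptions for illustration. For larger matrices a numerical eigendecomposition routine would normally be used instead.

```python
import math

def sqrt_factor_2x2(C):
    """Return A = Q D^{1/2} Q^T for a symmetric positive definite 2-by-2 matrix C."""
    (a, b), (_, c) = C
    theta = 0.5 * math.atan2(2 * b, a - c)      # rotation angle that diagonalizes C
    q1 = (math.cos(theta), math.sin(theta))     # first column of Q
    q2 = (-math.sin(theta), math.cos(theta))    # second column of Q

    def quad(q):
        """q^T C q, which is the eigenvalue belonging to the unit eigenvector q."""
        return a * q[0] ** 2 + 2 * b * q[0] * q[1] + c * q[1] ** 2

    d1, d2 = quad(q1), quad(q2)
    if min(d1, d2) <= 0:
        raise ValueError("C must be positive definite")
    s1, s2 = math.sqrt(d1), math.sqrt(d2)
    # Q D^{1/2} Q^T written out as s1 * q1 q1^T + s2 * q2 q2^T.
    return [[s1 * q1[i] * q1[j] + s2 * q2[i] * q2[j] for j in range(2)]
            for i in range(2)]

C = [[4.0, 1.2],
     [1.2, 1.0]]             # illustrative covariance matrix, det = 2.56 > 0
A = sqrt_factor_2x2(C)

# A A^T should reproduce C up to rounding.
AAT = [[sum(A[i][k] * A[j][k] for k in range(2)) for j in range(2)]
       for i in range(2)]
print(A)
print(AAT)
```

The resulting `A` is symmetric, so $\mathbf{A}\mathbf{A}^T$ and $\mathbf{A}^T\mathbf{A}$ coincide, matching the statement above.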

Now we are ready to define the transformation that converts a standard Gaussian vector to
$\mathbf{X} \sim N(\mathbf{m}, \mathbf{C})$. Let $\mathbf{Z}$ be a standard Gaussian vector, i.e., $\mathbf{Z} \sim N(\mathbf{0}, \mathbf{I})$. Define

$$\mathbf{X} = \mathbf{A}\mathbf{Z} + \mathbf{m}.$$

We claim that $\mathbf{X} \sim N(\mathbf{m}, \mathbf{C})$. To see this, first note that $\mathbf{X}$ is a normal random vector. The reason is
that any linear combination of components of $\mathbf{X}$ is indeed a linear combination of components of $\mathbf{Z}$
plus a constant. Thus, every linear combination of components of $\mathbf{X}$ is a normal random variable. It
remains to show that $E\mathbf{X} = \mathbf{m}$ and $\mathbf{C_X} = \mathbf{C}$. First note that by linearity of expectation we have

$$E\mathbf{X} = E[\mathbf{A}\mathbf{Z} + \mathbf{m}] = \mathbf{A}E[\mathbf{Z}] + \mathbf{m} = \mathbf{m}.$$

Also, by Example 6.12 we have

$$\mathbf{C_X} = \mathbf{A}\mathbf{C_Z}\mathbf{A}^T = \mathbf{A}\mathbf{A}^T \ (\text{since } \mathbf{C_Z} = \mathbf{I}) = \mathbf{C}.$$


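Thus, we have shown that $\mathbf{X}$ is a random vector with mean $\mathbf{m}$ and covariance matrix $\mathbf{C}$. Now we
can use Example 6.15 to find the PDF of $\mathbf{X}$. We have

$$\begin{aligned}
f_{\mathbf{X}}(\mathbf{x}) &= \frac{1}{|\det(\mathbf{A})|} f_{\mathbf{Z}}\big(\mathbf{A}^{-1}(\mathbf{x} - \mathbf{m})\big) \\
&= \frac{1}{(2\pi)^{\frac{n}{2}} |\det(\mathbf{A})|} \exp\left\{-\frac{1}{2}\big(\mathbf{A}^{-1}(\mathbf{x} - \mathbf{m})\big)^T\big(\mathbf{A}^{-1}(\mathbf{x} - \mathbf{m})\big)\right\} \\
&= \frac{1}{(2\pi)^{\frac{n}{2}} \sqrt{\det(\mathbf{C})}} \exp\left\{-\frac{1}{2}(\mathbf{x} - \mathbf{m})^T\mathbf{A}^{-T}\mathbf{A}^{-1}(\mathbf{x} - \mathbf{m})\right\} \\
&= \frac{1}{(2\pi)^{\frac{n}{2}} \sqrt{\det(\mathbf{C})}} \exp\left\{-\frac{1}{2}(\mathbf{x} - \mathbf{m})^T\mathbf{C}^{-1}(\mathbf{x} - \mathbf{m})\right\}.
\end{aligned}$$

For a normal random vector $\mathbf{X}$ with mean $\mathbf{m}$ and covariance matrix $\mathbf{C}$, the PDF is given by

$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{\frac{n}{2}} \sqrt{\det \mathbf{C}}} \exp\left\{-\frac{1}{2}(\mathbf{x} - \mathbf{m})^T\mathbf{C}^{-1}(\mathbf{x} - \mathbf{m})\right\} \qquad (6.1)$$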


Example 6.16 

Let $X$ and $Y$ be two jointly normal random variables with $X \sim N(\mu_X, \sigma_X^2)$, $Y \sim N(\mu_Y, \sigma_Y^2)$, and
$\rho(X, Y) = \rho$. Show that the above PDF formula for the PDF of $\begin{bmatrix} X \\ Y \end{bmatrix}$ is the same as $f_{XY}(x, y)$ given in
Definition 5.4 in Section 5.3.2. That is,

$$f_{XY}(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}} \exp\left\{-\frac{1}{2(1 - \rho^2)}\left[\left(\frac{x - \mu_X}{\sigma_X}\right)^2 + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2 - 2\rho\frac{(x - \mu_X)(y - \mu_Y)}{\sigma_X\sigma_Y}\right]\right\}.$$


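• Solution

Both formulas are in the form $ae^{-\frac{1}{2}b}$. Thus, it suffices to show that they have the same
$a$ and $b$. Here we have

$$\mathbf{m} = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}.$$

We also have

$$\mathbf{C} = \begin{bmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X, Y) \\ \mathrm{Cov}(Y, X) & \mathrm{Var}(Y) \end{bmatrix} = \begin{bmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{bmatrix}.$$

From this, we obtain

$$\det \mathbf{C} = \sigma_X^2\sigma_Y^2(1 - \rho^2).$$

Thus, in both formulas for the PDF, $a$ is given by

$$a = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}}.$$

Next, we check $b$. We have

$$\mathbf{C}^{-1} = \frac{1}{\sigma_X^2\sigma_Y^2(1 - \rho^2)} \begin{bmatrix} \sigma_Y^2 & -\rho\sigma_X\sigma_Y \\ -\rho\sigma_X\sigma_Y & \sigma_X^2 \end{bmatrix}.$$

Now by matrix multiplication we obtain

$$\begin{aligned}
(\mathbf{x} - \mathbf{m})^T\mathbf{C}^{-1}(\mathbf{x} - \mathbf{m}) &= \frac{1}{\sigma_X^2\sigma_Y^2(1 - \rho^2)} \begin{bmatrix} x - \mu_X \\ y - \mu_Y \end{bmatrix}^T \begin{bmatrix} \sigma_Y^2 & -\rho\sigma_X\sigma_Y \\ -\rho\sigma_X\sigma_Y & \sigma_X^2 \end{bmatrix} \begin{bmatrix} x - \mu_X \\ y - \mu_Y \end{bmatrix} \\
&= \frac{1}{1 - \rho^2}\left[\left(\frac{x - \mu_X}{\sigma_X}\right)^2 + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2 - 2\rho\frac{(x - \mu_X)(y - \mu_Y)}{\sigma_X\sigma_Y}\right].
\end{aligned}$$

Multiplying this $b$ by $-\frac{1}{2}$ gives the exponent $-\frac{1}{2(1 - \rho^2)}[\cdots]$, which agrees with the formula in Definition 5.4.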
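The agreement claimed in Example 6.16 can also be checked numerically by evaluating both forms of the PDF at the same point. In the sketch below, the function names `pdf_matrix_form` and `pdf_bivariate`, the parameter values, and the evaluation point are assumptions made for illustration. The matrix form uses the closed-form inverse of a 2-by-2 matrix.

```python
import math

def pdf_matrix_form(x, m, C):
    """Equation 6.1 for n = 2, using the closed-form inverse of a 2-by-2 matrix."""
    (a, b), (_, d) = C
    det = a * d - b * b
    u, v = x[0] - m[0], x[1] - m[1]
    # (x - m)^T C^{-1} (x - m) with C^{-1} = [[d, -b], [-b, a]] / det.
    quad = (d * u * u - 2 * b * u * v + a * v * v) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def pdf_bivariate(x, y, mu_x, mu_y, s_x, s_y, rho):
    """Bivariate normal PDF written with means, standard deviations, and rho."""
    zx, zy = (x - mu_x) / s_x, (y - mu_y) / s_y
    expo = -(zx * zx + zy * zy - 2 * rho * zx * zy) / (2 * (1 - rho * rho))
    return math.exp(expo) / (2 * math.pi * s_x * s_y * math.sqrt(1 - rho * rho))

# Illustrative parameters (s_x and s_y are standard deviations).
mu_x, mu_y, s_x, s_y, rho = 1.0, -2.0, 1.5, 0.7, -0.4
C = [[s_x ** 2, rho * s_x * s_y],
     [rho * s_x * s_y, s_y ** 2]]

p1 = pdf_matrix_form([0.3, -1.1], [mu_x, mu_y], C)
p2 = pdf_bivariate(0.3, -1.1, mu_x, mu_y, s_x, s_y, rho)
print(p1, p2)
```

The two printed values agree up to rounding for any valid choice of parameters with $|\rho| < 1$.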


Remember that two jointly normal random variables $X$ and $Y$ are independent if and only if they are
uncorrelated. We can extend this to multiple jointly normal random variables. Thus, if you have a
normal random vector whose components are uncorrelated, you can conclude that the components
are independent. To show this, note that if the $X_i$'s are uncorrelated, then the covariance matrix $\mathbf{C_X}$
is diagonal, so its inverse $\mathbf{C_X}^{-1}$ is also diagonal. You can see that in this case the PDF (Equation
6.1) becomes the product of the marginal PDFs.

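If $\mathbf{X} = [X_1, X_2, \dots, X_n]^T$ is a normal random vector, and we know $\mathrm{Cov}(X_i, X_j) = 0$ for all $i \neq j$,
then $X_1, X_2, \dots, X_n$ are independent.

Another important result is that if $\mathbf{X} = [X_1, X_2, \dots, X_n]^T$ is a normal random vector then
$\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b}$ is also a normal random vector because any linear combination of components of $\mathbf{Y}$ is also a
linear combination of components of $\mathbf{X}$ plus a constant value.

If $\mathbf{X} = [X_1, X_2, \dots, X_n]^T$ is a normal random vector, $\mathbf{X} \sim N(\mathbf{m}, \mathbf{C})$, $\mathbf{A}$ is an $m$ by $n$ fixed matrix,
and $\mathbf{b}$ is an $m$-dimensional fixed vector, then the random vector $\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{b}$ is a normal random
vector with mean $\mathbf{A}E\mathbf{X} + \mathbf{b}$ and covariance matrix $\mathbf{A}\mathbf{C}\mathbf{A}^T$:

$$\mathbf{Y} \sim N(\mathbf{A}E\mathbf{X} + \mathbf{b}, \mathbf{A}\mathbf{C}\mathbf{A}^T).$$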

